Patent Document Retrieval and Classification at KAIST

Authors

  • Jae-Ho Kim
  • Jin-Xia Huang
  • Ha-Yong Jung
  • Key-Sun Choi
Abstract

In this paper, we propose a method to retrieve patent documents similar to a given patent and to classify a given patent. We focus on one characteristic of patents: “patents are structuralized by claims, purposes, effects, embodiments of the invention and so on.” To retrieve similar documents from the target document set, specific components that denote so-called ‘semantic elements’, such as “claim”, “purpose” and “application field”, are compared instead of the whole texts.

Keywords: Patent Retrieval, Patent Classification, Structural Information, kNN, MEM, Hierarchical Classification
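The idea of comparing semantic elements rather than whole texts can be illustrated with a minimal sketch. This is not the authors' implementation: the field names (`claim`, `purpose`, `application_field`), the bag-of-words cosine measure, the equal weights, and the helper names `component_similarity` and `retrieve` are all assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical component names; the abstract lists "claim", "purpose"
# and "application field" as examples of semantic elements.
COMPONENTS = ("claim", "purpose", "application_field")


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def component_similarity(query: dict, candidate: dict, weights=None) -> float:
    """Weighted sum of per-component similarities (equal weights assumed)."""
    weights = weights or {c: 1.0 / len(COMPONENTS) for c in COMPONENTS}
    return sum(
        weights[c] * cosine(query.get(c, ""), candidate.get(c, ""))
        for c in COMPONENTS
    )


def retrieve(query: dict, collection: dict, k: int = 3) -> list:
    """Return the ids of the k candidates most similar to the query patent."""
    scored = [
        (component_similarity(query, doc), doc_id)
        for doc_id, doc in collection.items()
    ]
    # Sort by descending score, breaking ties by document id.
    return [doc_id for _, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))[:k]]
```

Each patent is represented here as a dictionary of component texts, so a candidate is scored only on how its claim matches the query's claim, its purpose matches the query's purpose, and so on. The ranked neighbours returned by `retrieve` could also feed a kNN-style classifier, although how the paper combines these steps is not stated in the abstract.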

Similar resources

Overview of Patent Retrieval Task at NTCIR-5

In the Fifth NTCIR Workshop, we organized the Patent Retrieval Task and performed three subtasks: Document Retrieval, Passage Retrieval, and Classification. This paper describes the Document Retrieval Subtask and the Passage Retrieval Subtask, both of which were intended for the patent-to-patent invalidity search task. We show the evaluation results of the groups participating in those subtasks.

Document Analysis And Classification Based On Passing Window

In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing step by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...

Phrase-Based Document Categorization

(Chapter in the Springer book “Current Challenges in Patent Information Retrieval”, to appear in May 2011) This paper takes a fresh look at an old idea in Information Retrieval: the use of linguistically extracted phrases as terms in the automatic categorization of documents, and in particular the pre-classification of patent applications. In Information Retrieval, until now there was found little ...

Test Collections for Patent Retrieval and Patent Classification in the Fifth NTCIR Workshop

This paper describes the test collections produced for the Patent Retrieval Task in the Fifth NTCIR Workshop. We performed the invalidity search task, in which each participant group searches a patent collection for the patents that can invalidate the demand in an existing claim. For this purpose, we performed both document and passage retrieval tasks. We also performed the automatic patent cla...

POSTECH at NTCIR-5 Patent Retrieval: Smoothing Experiments in a Language Modeling Approach to Patent Retrieval

This report describes the experimental results of our participation in the Document Retrieval Subtask of the NTCIR-5 Patent Retrieval Task. Unlike newspaper articles, which belong to the main document type handled in previous information retrieval experiments, patent documents have many different characteristics in terms of length, technicality, structureness, etc. Among these, we focus on the lengt...

Publication date: 2005